Skip to content

[Feat] 온보딩 OCR API 구현#206

Merged
jeong1112 merged 20 commits intodevelopfrom
feat/#204-ocr
Mar 26, 2026
Merged

[Feat] 온보딩 OCR API 구현#206
jeong1112 merged 20 commits intodevelopfrom
feat/#204-ocr

Conversation

@jeong1112
Copy link
Copy Markdown
Collaborator

@jeong1112 jeong1112 commented Mar 26, 2026

Related issue 🛠

Work Description 📝

  • 온보딩 과정에서 사용자의 여권 / 비자 문서에서 정보를 내려주는 로직을 구현했습니다.
  • 여권은 국가마다 형식과 언어가 다르지만 하단에 공통 규격인 MRZ가 존재하여 MRZ를 기반으로 파싱하여 응답을 내려줍니다.
  • 몇 번째 자리에 어떤 내용이 담겨져 있는지가 정해져 있어서 인덱스 기반으로 파싱했습니다.
  • 규격으로 지정된 국가 별 코드가 있는데, 자바에서 Locale이라는 것을 통해 이를 제공받을 수 있어서 Country에 담긴 값들을 매핑해주도록 했습니다.
  • 비자 문서는 별도 규격이 없지만 타겟이 한국 거주 유학생이다 보니 단순 문자열 기반 파싱으로 해결했습니다.
  • 두 API 모두 Naver Clova OCR 외부 API를 타다 보니 입력값으로 들어온 사진의 품질과 OCR 결과에 따라 파싱 결과가 null이 될 수도 있습니다.

ScreenShots 📷

To Reviewers 📢

Summary by CodeRabbit

릴리스 노트

  • 새로운 기능

    • 비자 서류 OCR 및 파싱 기능 추가
    • 여권 서류 OCR 및 MRZ 기반 파싱 기능 추가
    • 비자·여권 OCR 업로드용 multipart API 엔드포인트 추가
    • 페이지 단위 텍스트 추출 및 전체 텍스트 반환 기능 추가
  • 리팩토링

    • 문서 처리 로직을 통합된 서비스로 마이그레이션하여 파일 업로드 흐름 간소화
  • 버그 수정

    • 다양한 날짜·국가 표기 처리 일치성 및 내구성 향상

생년월일을 비자 발급일로 파싱하던 오류를 수정함
- 기존에 JPG밖에 지원을 안 했어서 Naver Clova OCR API가 지원하는 JPG, JPEG, PNG에 대해 적용될 수 있도록 수정함.
- OCR 수행 중 내부 로직에서 예외 발생 시 이를 응답으로 명확하게 내려주도록 함
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 26, 2026

📝 Walkthrough

Walkthrough

온보딩용 비자·여권 OCR API를 추가하고, 전역 문서 처리 서비스로 PDF/이미지 텍스트 추출 및 OCR 파이프라인을 도입·리팩토링했으며, 비자/여권 전용 파서·유틸 및 관련 예외/DTO들을 추가했습니다.

Changes

Cohort / File(s) Summary
API 엔드포인트 & DTO
src/main/java/org/sopt/kareer/domain/member/controller/MemberController.java, src/main/java/org/sopt/kareer/domain/member/dto/response/OcrVisaResponse.java, src/main/java/org/sopt/kareer/domain/member/dto/response/OcrPassportResponse.java
POST /onboard/ocr/visa, POST /onboard/ocr/passport 엔드포인트 추가 및 OCR 응답 DTO(OcrVisaResponse, OcrPassportResponse) 추가.
멤버 서비스
src/main/java/org/sopt/kareer/domain/member/service/MemberService.java
getVisaOcr(MultipartFile)getPassportOcr(MultipartFile) 메서드 추가—DocumentProcessingService로 텍스트 추출 후 파서 적용. 예외 변환 로직 포함.
OCR 파서 및 국가 해석기
src/main/java/org/sopt/kareer/domain/member/util/VisaOcrParser.java, src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java, src/main/java/org/sopt/kareer/domain/member/util/CountryResolver.java
비자·여권 OCR 텍스트 파싱 구현(MRZ 추출, 날짜 해석, 국가 ISO3 매핑 등).
전역 문서 처리 서비스 & DTO
src/main/java/org/sopt/kareer/global/document/service/DocumentProcessingService.java, src/main/java/org/sopt/kareer/global/document/dto/response/PageText.java
MultipartFile 기반 PDF 및 이미지 텍스트 추출(OCR 통합) 서비스 추가. 페이지 단위 결과(PageText) 제공.
문서 예외·에러 코드
src/main/java/org/sopt/kareer/global/document/exception/DocumentErrorCode.java, src/main/java/org/sopt/kareer/global/document/exception/DocumentException.java
문서 처리 전용 에러 코드 및 예외 타입 추가.
문서 유틸리티
src/main/java/org/sopt/kareer/global/document/util/DocumentDateUtils.java, src/main/java/org/sopt/kareer/global/document/util/DocumentTextUtils.java
다양한 날짜 포맷 파싱 및 텍스트 정규화 유틸 추가.
기존 코드 마이그레이션
src/main/java/org/sopt/kareer/domain/jobposting/util/ResumeContextService.java, src/main/java/org/sopt/kareer/global/external/ai/service/RagEmbeddingService.java
기존 Clova 기반 DocumentProcessingService 호출을 새로운 전역 DocumentProcessingService로 교체하고 임시 파일 생성 로직 제거.
Clova OCR 업데이트 및 레거시 제거
src/main/java/org/sopt/kareer/global/external/clova/service/ClovaOcrService.java, src/main/java/org/sopt/kareer/global/external/clova/exception/..., src/main/java/org/sopt/kareer/global/external/clova/service/DocumentProcessingService.java (삭제), src/main/java/org/sopt/kareer/global/external/clova/dto/response/PageText.java (삭제)
Clova OCR 서비스의 이미지 전처리·인코딩 변경 및 에러 처리 강화. 레거시 clova DocumentProcessingService와 중복 DTO(PageText) 제거.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Controller as MemberController
    participant Service as MemberService
    participant DocSvc as DocumentProcessingService
    participant Clova as ClovaOcrService
    participant Parser as Visa/Passport Parser

    Client->>Controller: POST /onboard/ocr/{visa|passport} (MultipartFile)
    Controller->>Service: get{Visa|Passport}Ocr(file)
    Service->>DocSvc: extractText(file)
    alt PDF 페이지에 텍스트 없음
        DocSvc->>Clova: doOcr(page image)
        Clova-->>DocSvc: ocrText
    else 이미지 파일
        DocSvc->>Clova: doOcr(image)
        Clova-->>DocSvc: ocrText
    end
    DocSvc-->>Service: rawText
    Service->>Parser: parse(rawText)
    Parser-->>Service: VisaInfo / PassportInfo
    Service-->>Controller: Ocr{Visa|Passport}Response
    Controller-->>Client: 200 OK (BaseResponse)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested labels

✨ Feature, 🐯정연🐯

Suggested reviewers

  • hyomee2
  • eraser502

시 🐰

사진 속 비자여권, 글자 춤추네 🥕
클로바와 파서가 손을 잡고,
텍스트는 정리되어 흐르고,
온보딩은 더 부드럽게 열리네,
토끼가 기뻐 뛰며 축하해요 🐇🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 1.75% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed PR 제목은 온보딩 OCR API 구현이라는 주요 변경사항을 명확하고 간결하게 요약하고 있습니다.
Linked Issues check ✅ Passed 비자 여권 사진을 통한 개인 정보 추출 API 구현 요구사항이 완전히 충족되었습니다. 패스포트(MRZ 기반), 비자(규칙 기반) 파싱, Clova OCR 통합, 컨트롤러 엔드포인트 모두 구현되었습니다.
Out of Scope Changes check ✅ Passed 문서 처리 서비스 리팩토링(Clova → 일반 DocumentProcessingService 마이그레이션)과 ResumeContextService, RagEmbeddingService 업데이트는 OCR API와 밀접한 관련이 있습니다.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/#204-ocr

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (5)
src/main/java/org/sopt/kareer/domain/member/util/CountryResolver.java (2)

13-13: 인스턴스 필드에 상수 명명 규칙 사용

ISO3_MAPstatic final이 아닌 인스턴스 필드이므로, Java 명명 규칙에 따라 camelCase(iso3Map)를 사용하는 것이 적절합니다.

♻️ 제안된 수정
-    private final Map<String, Country> ISO3_MAP;
+    private final Map<String, Country> iso3Map;

이후 사용처도 함께 변경해야 합니다.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/org/sopt/kareer/domain/member/util/CountryResolver.java` at
line 13, The field ISO3_MAP in CountryResolver is an instance field but uses
constant-style naming; rename ISO3_MAP to iso3Map to follow Java camelCase
conventions and update all references/usages of the field (constructors,
methods, tests) accordingly; ensure access modifiers and behavior remain
unchanged and run tests/compilation to confirm no remaining references to
ISO3_MAP.

31-32: 예외 무시(silent swallow)로 인한 디버깅 어려움

catch 블록은 국가 매핑 실패 원인을 파악하기 어렵게 만듭니다. 최소한 디버그 레벨 로깅을 추가하면 운영 환경에서 매핑 누락 문제를 진단하는 데 도움이 됩니다.

♻️ 제안된 수정

클래스에 @Slf4j 추가 후:

             } catch (Exception ignored) {
+                log.debug("Failed to resolve ISO3 for country: {}", country.getCountryName());
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/org/sopt/kareer/domain/member/util/CountryResolver.java` around
lines 31 - 32, The empty catch in CountryResolver silently swallows exceptions
making mapping failures hard to debug; add Lombok `@Slf4j` to the CountryResolver
class and replace the empty catch(Exception ignored) block with a log call
(e.g., log.debug or log.warn) that includes context and the exception (for
example: "Failed to resolve country for input {}" plus the exception) so
failures are recorded for diagnostics while preserving existing behavior.
src/main/java/org/sopt/kareer/global/document/exception/DocumentException.java (1)

10-12: 들여쓰기 불일치

두 번째 생성자의 super 호출이 2칸 들여쓰기를 사용하고 있어 첫 번째 생성자(4칸)와 일치하지 않습니다.

♻️ 제안된 수정
     public DocumentException(DocumentErrorCode errorCode, String message) {
-      super(errorCode, message);
+        super(errorCode, message);
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@src/main/java/org/sopt/kareer/global/document/exception/DocumentException.java`
around lines 10 - 12, The second constructor in DocumentException (public
DocumentException(DocumentErrorCode errorCode, String message)) has its
super(...) call indented with 2 spaces while the first constructor uses 4
spaces; update the indentation of the super(errorCode, message); line in that
constructor to match the 4-space indentation style used for the first
constructor so formatting is consistent across both constructors.
src/main/java/org/sopt/kareer/domain/member/controller/MemberController.java (1)

164-178: 신규 OCR API에도 예외 응답 명세를 추가해 주세요.

다른 엔드포인트와 달리 @CustomExceptionDescription이 없어 Swagger 에러 계약이 비어 있습니다. 클라이언트 연동 안정성을 위해 동일하게 명세하는 것이 좋습니다.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/org/sopt/kareer/domain/member/controller/MemberController.java`
around lines 164 - 178, Add the same `@CustomExceptionDescription` metadata to the
two OCR endpoints so their Swagger error contracts match the rest of the
controller: annotate the getVisaInfo and getPassportInfo methods in
MemberController with the same `@CustomExceptionDescription` annotation (and
parameters/codes) used on other endpoints in this class so the error responses
are documented consistently for these POST /onboard/ocr/visa and POST
/onboard/ocr/passport handlers.
src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java (1)

79-81: 정규식은 매 호출마다 컴파일하지 말고 상수로 재사용하세요.

collectCandidates에서 Pattern.compile을 반복 호출하고 있어 불필요한 오버헤드가 있습니다. 클래스 상수로 올려 재사용하는 편이 낫습니다.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java`
around lines 79 - 81, In collectCandidates, avoid compiling the regex on every
call; extract Pattern.compile("[A-Z0-9<]{20,}") into a reusable static final
Pattern (e.g., MRZ_PATTERN) at class level in PassportOcrParser and replace the
local Pattern.compile(...) usage with MRZ_PATTERN.matcher(upper) so the same
compiled Pattern is reused across invocations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@src/main/java/org/sopt/kareer/domain/member/controller/MemberController.java`:
- Around line 167-177: Remove the checked IOException from the controller method
signatures and ensure OCR failures are converted to a domain/runtime exception
and mapped to a standard error response: stop declaring "throws IOException" on
getVisaInfo/getPassportInfo and instead handle IO errors coming from
memberService.getVisaOcr(file) and memberService.getPassportOcr(file) by either
(a) catching IOException in the controller and rethrowing a custom unchecked
OcrProcessingException (or returning a standardized BaseResponse error), or (b)
converting IOException inside the service into a domain exception; add an
OcrProcessingException (or reuse a suitable domain exception) and
register/extend the global exception handler (e.g., GlobalExceptionHandler) to
translate that exception into the unified error response and proper HTTP status.
Ensure you reference memberService.getVisaOcr and memberService.getPassportOcr
when making the change.

In `@src/main/java/org/sopt/kareer/domain/member/service/MemberService.java`:
- Around line 184-195: The getVisaOcr and getPassportOcr methods currently
propagate IOException from DocumentProcessingService.extractText; instead catch
IOException inside these methods and convert it to a domain/runtime exception
(e.g., DocumentProcessingException or a checked-to-unchecked wrapper used
elsewhere like ResumeContextService) so the service layer does not expose
checked IO exceptions; update getVisaOcr and getPassportOcr to try-catch around
extractText, log or include the cause, and throw the chosen domain exception
while preserving the original exception as the cause.

In `@src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java`:
- Around line 27-37: The PassportOcrParser currently logs sensitive PII (rawText
and MRZ contents) via log.info using variables rawText and mrz (from
DocumentTextUtils.normalize and extractMrz); remove or replace these logs so PII
is not written to INFO: either delete the log.info calls referencing rawText and
mrz or log non-sensitive metadata only (e.g., masked values or presence/length
flags) before returning PassportInfo, ensuring no full name, DOB, document ID,
or MRZ lines are emitted.

In `@src/main/java/org/sopt/kareer/domain/member/util/VisaOcrParser.java`:
- Around line 40-42: In VisaOcrParser.parse, stop logging the full normalized
OCR string (DocumentTextUtils.normalize result) to avoid leaking PII; instead
log only non-identifying metadata such as the normalized text length and a
parse-success indicator (or a stable hash/fingerprint if you need correlation)
using the existing logger (log.info) and keep any detailed errors at debug level
without including raw text; update references in parse to remove the rawText log
and replace it with e.g. length/hashed summary and success/failure information.

In
`@src/main/java/org/sopt/kareer/global/document/service/DocumentProcessingService.java`:
- Around line 46-47: The image/PDF content-type checks in
DocumentProcessingService are too strict and fail when Content-Type is missing
or set to application/octet-stream; update the logic so that when handling
uploads (methods calling isImage(...) / extractPagesFromImage(...) and the PDF
branch using isPdf(...) / extractPagesFromPdf(...)) you fall back to the
uploaded filename extension (use the original filename from the MultipartFile or
equivalent) — e.g., enhance or add a helper like isImageByExtension(filename)
and isPdfByExtension(filename) and change the branches to use: if
(isImage(contentType) || isImageByExtension(filename)) return
extractPagesFromImage(file); and similarly for PDF (lines around
extractPagesFromPdf) so JPEG/JPG/PNG files with missing MIME types are correctly
recognized.

In `@src/main/java/org/sopt/kareer/global/document/util/DocumentDateUtils.java`:
- Around line 49-51: DocumentDateUtils의 2자리 연도 처리에서 고정 컷오프(>=50)를 사용해 미래 생년월일이
만들어지는 문제가 있습니다; 대신 parse/convert 로직(해당 메서드에서 year += (year >= 50 ? 1900 : 2000);
및 return LocalDate.of(...))을 변경해 먼저 2자리 연도를 1900/2000 기준으로 변환한 뒤(LocalDate 생성
시), 생성된 날짜가 오늘(today)보다 미래라면 year -= 100 하여 100년 뒤로 보정하는 동적 피벗 방식을 사용하도록 수정하세요.
Ensure the adjustment happens before returning the LocalDate and keep validation
of month/day as-is.
- Around line 29-31: The code in DocumentDateUtils force-uppercases the input
(the variable `value` via `.toUpperCase(Locale.ROOT)`), which breaks parsing of
English month names for the `DateTimeFormatter` patterns `d MMM yyyy`/`dd MMM
yyyy`; remove the `.toUpperCase(Locale.ROOT)` call from the `value`
normalization or alternatively make the formatters case-insensitive by using
`DateTimeFormatterBuilder.parseCaseInsensitive()` when building the
`DateTimeFormatter` instances used by the parsing methods (keep references to
the `value` variable and the formatter-building logic in DocumentDateUtils to
locate where to change).

---

Nitpick comments:
In
`@src/main/java/org/sopt/kareer/domain/member/controller/MemberController.java`:
- Around line 164-178: Add the same `@CustomExceptionDescription` metadata to the
two OCR endpoints so their Swagger error contracts match the rest of the
controller: annotate the getVisaInfo and getPassportInfo methods in
MemberController with the same `@CustomExceptionDescription` annotation (and
parameters/codes) used on other endpoints in this class so the error responses
are documented consistently for these POST /onboard/ocr/visa and POST
/onboard/ocr/passport handlers.

In `@src/main/java/org/sopt/kareer/domain/member/util/CountryResolver.java`:
- Line 13: The field ISO3_MAP in CountryResolver is an instance field but uses
constant-style naming; rename ISO3_MAP to iso3Map to follow Java camelCase
conventions and update all references/usages of the field (constructors,
methods, tests) accordingly; ensure access modifiers and behavior remain
unchanged and run tests/compilation to confirm no remaining references to
ISO3_MAP.
- Around line 31-32: The empty catch in CountryResolver silently swallows
exceptions making mapping failures hard to debug; add Lombok `@Slf4j` to the
CountryResolver class and replace the empty catch(Exception ignored) block with
a log call (e.g., log.debug or log.warn) that includes context and the exception
(for example: "Failed to resolve country for input {}" plus the exception) so
failures are recorded for diagnostics while preserving existing behavior.

In `@src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java`:
- Around line 79-81: In collectCandidates, avoid compiling the regex on every
call; extract Pattern.compile("[A-Z0-9<]{20,}") into a reusable static final
Pattern (e.g., MRZ_PATTERN) at class level in PassportOcrParser and replace the
local Pattern.compile(...) usage with MRZ_PATTERN.matcher(upper) so the same
compiled Pattern is reused across invocations.

In
`@src/main/java/org/sopt/kareer/global/document/exception/DocumentException.java`:
- Around line 10-12: The second constructor in DocumentException (public
DocumentException(DocumentErrorCode errorCode, String message)) has its
super(...) call indented with 2 spaces while the first constructor uses 4
spaces; update the indentation of the super(errorCode, message); line in that
constructor to match the 4-space indentation style used for the first
constructor so formatting is consistent across both constructors.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 17a9fc2f-77d4-4c04-9902-2e7a0731ad3b

📥 Commits

Reviewing files that changed from the base of the PR and between 7d7fce3 and d83ee73.

📒 Files selected for processing (21)
  • src/main/java/org/sopt/kareer/domain/jobposting/util/ResumeContextService.java
  • src/main/java/org/sopt/kareer/domain/member/controller/MemberController.java
  • src/main/java/org/sopt/kareer/domain/member/dto/response/OcrPassportResponse.java
  • src/main/java/org/sopt/kareer/domain/member/dto/response/OcrVisaResponse.java
  • src/main/java/org/sopt/kareer/domain/member/entity/enums/VisaType.java
  • src/main/java/org/sopt/kareer/domain/member/service/MemberService.java
  • src/main/java/org/sopt/kareer/domain/member/util/CountryResolver.java
  • src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java
  • src/main/java/org/sopt/kareer/domain/member/util/VisaOcrParser.java
  • src/main/java/org/sopt/kareer/global/document/dto/response/PageText.java
  • src/main/java/org/sopt/kareer/global/document/exception/DocumentErrorCode.java
  • src/main/java/org/sopt/kareer/global/document/exception/DocumentException.java
  • src/main/java/org/sopt/kareer/global/document/service/DocumentProcessingService.java
  • src/main/java/org/sopt/kareer/global/document/util/DocumentDateUtils.java
  • src/main/java/org/sopt/kareer/global/document/util/DocumentTextUtils.java
  • src/main/java/org/sopt/kareer/global/external/ai/service/RagEmbeddingService.java
  • src/main/java/org/sopt/kareer/global/external/clova/dto/response/PageText.java
  • src/main/java/org/sopt/kareer/global/external/clova/exception/ClovaErrorCode.java
  • src/main/java/org/sopt/kareer/global/external/clova/exception/ClovaException.java
  • src/main/java/org/sopt/kareer/global/external/clova/service/ClovaOcrService.java
  • src/main/java/org/sopt/kareer/global/external/clova/service/DocumentProcessingService.java
💤 Files with no reviewable changes (2)
  • src/main/java/org/sopt/kareer/global/external/clova/dto/response/PageText.java
  • src/main/java/org/sopt/kareer/global/external/clova/service/DocumentProcessingService.java

Comment thread src/main/java/org/sopt/kareer/domain/member/controller/MemberController.java Outdated
Comment thread src/main/java/org/sopt/kareer/domain/member/service/MemberService.java Outdated
Comment thread src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java Outdated
Comment thread src/main/java/org/sopt/kareer/domain/member/util/VisaOcrParser.java Outdated
Comment thread src/main/java/org/sopt/kareer/global/document/util/DocumentDateUtils.java Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java (1)

150-170: 불필요한 try-catch 블록입니다.

CountryResolver.resolveIso3()는 예외를 던지지 않고 알 수 없는 코드에 대해 null을 반환합니다(CountryResolver.java:47-50 참조). try-catch 래퍼는 제거해도 됩니다.

♻️ 리팩토링 제안
 private Country extractCountry(String line1, String line2) {
-    try {
-        if (line1 != null) {
-            String fitted = normalizeLineLength(line1);
-            String code = substringSafely(fitted, 2, 5);
-            Country c = countryResolver.resolveIso3(code);
-            if (c != null) return c;
-        }
-
-        if (line2 != null) {
-            String fitted = normalizeLineLength(line2);
-            String code = substringSafely(fitted, 10, 13);
-            return countryResolver.resolveIso3(code);
-        }
-
-        return null;
-
-    } catch (Exception e) {
-        return null;
+    if (line1 != null) {
+        String fitted = normalizeLineLength(line1);
+        String code = substringSafely(fitted, 2, 5);
+        Country c = countryResolver.resolveIso3(code);
+        if (c != null) return c;
     }
+
+    if (line2 != null) {
+        String fitted = normalizeLineLength(line2);
+        String code = substringSafely(fitted, 10, 13);
+        return countryResolver.resolveIso3(code);
+    }
+
+    return null;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java`
around lines 150 - 170, The try-catch in extractCountry is unnecessary because
CountryResolver.resolveIso3 does not throw; remove the surrounding try-catch
block in the extractCountry method and keep the existing logic: check line1 ->
normalizeLineLength(line1) -> substringSafely(...,2,5) ->
countryResolver.resolveIso3(code) and if non-null return it; otherwise check
line2 using normalizeLineLength(line2) and substringSafely(...,10,13) and return
countryResolver.resolveIso3(code); preserve the final return null path and
existing helper calls (normalizeLineLength, substringSafely,
countryResolver.resolveIso3) but eliminate the exception handling wrapper.
src/main/java/org/sopt/kareer/domain/member/service/MemberService.java (1)

184-197: 이전 리뷰에서 지적된 IOException 노출 문제가 해결되었습니다.

DocumentProcessingService.extractText()IOException을 도메인 예외(DocumentException)로 적절히 변환하고 있습니다.

다만, 현재 DocumentException 생성자는 (ErrorCode errorCode, String messageDetail) 형태만 지원하므로, 디버깅을 위해 원본 예외 정보를 messageDetail 파라미터로 전달하는 것을 권장합니다:

♻️ 원본 예외 정보를 messageDetail로 포함
         } catch(Exception e) {
             throw new DocumentException(
-                    DocumentErrorCode.OCR_PROCESSING_FAILED
+                    DocumentErrorCode.OCR_PROCESSING_FAILED,
+                    e.getMessage()
             );
         }

또는 더 자세한 정보가 필요하면:

         } catch(Exception e) {
             throw new DocumentException(
-                    DocumentErrorCode.OCR_PROCESSING_FAILED
+                    DocumentErrorCode.OCR_PROCESSING_FAILED,
+                    e.toString()
             );
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/org/sopt/kareer/domain/member/service/MemberService.java`
around lines 184 - 197, In getVisaOcr in MemberService, wrap and propagate the
original exception details into the DocumentException so debugging info isn't
lost: when catching generic Exception (or specifically IOException rethrown from
DocumentProcessingService.extractText), call the DocumentException constructor
that accepts (ErrorCode, String messageDetail) and pass the original exception's
message (e.getMessage()) or a combined message with context (e.g. "OCR failed: "
+ e.getMessage()), and optionally include exception class name; keep the thrown
type DocumentException and do this inside the catch(Exception e) block
referenced in getVisaOcr.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@src/main/java/org/sopt/kareer/global/external/clova/service/ClovaOcrService.java`:
- Around line 61-71: The ClovaOcrService builds a ClovaException including raw
provider response (errorBody) which then gets returned to clients; change the
code so the detailed provider body is only logged via log.error and not embedded
in the ClovaException message — create the ClovaException using safe metadata
(e.g., "CLOVA OCR error. status=<status>") and
ClovaErrorCode.EXTRACT_IMAGE_FAILED, while keeping the full errorBody and any
exception details in the log.debug/error call; apply the same change to the
second occurrence that constructs a ClovaException around lines handling
clientResponse/bodyToMono to ensure external messages are generalized.
- Around line 77-85: The current check in ClovaOcrService that returns "" for
response==null, images==null/empty, or fields==null masks provider errors;
change the logic so only fields.isEmpty() is treated as a valid empty OCR result
(return ""), while response==null, response.images()==null,
response.images().isEmpty(), or fields==null should throw an exception (e.g., an
ExternalServiceException or propagate a runtime exception) so
DocumentProcessingService sees provider failures instead of silently treating
them as empty pages; adjust the checks around response, images(), and fields()
accordingly in the method that reads the OCR response.

---

Nitpick comments:
In `@src/main/java/org/sopt/kareer/domain/member/service/MemberService.java`:
- Around line 184-197: In getVisaOcr in MemberService, wrap and propagate the
original exception details into the DocumentException so debugging info isn't
lost: when catching generic Exception (or specifically IOException rethrown from
DocumentProcessingService.extractText), call the DocumentException constructor
that accepts (ErrorCode, String messageDetail) and pass the original exception's
message (e.getMessage()) or a combined message with context (e.g. "OCR failed: "
+ e.getMessage()), and optionally include exception class name; keep the thrown
type DocumentException and do this inside the catch(Exception e) block
referenced in getVisaOcr.

In `@src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java`:
- Around line 150-170: The try-catch in extractCountry is unnecessary because
CountryResolver.resolveIso3 does not throw; remove the surrounding try-catch
block in the extractCountry method and keep the existing logic: check line1 ->
normalizeLineLength(line1) -> substringSafely(...,2,5) ->
countryResolver.resolveIso3(code) and if non-null return it; otherwise check
line2 using normalizeLineLength(line2) and substringSafely(...,10,13) and return
countryResolver.resolveIso3(code); preserve the final return null path and
existing helper calls (normalizeLineLength, substringSafely,
countryResolver.resolveIso3) but eliminate the exception handling wrapper.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: a91c5ffc-c9bf-470c-b80f-ef55870d0fde

📥 Commits

Reviewing files that changed from the base of the PR and between d83ee73 and a136e09.

📒 Files selected for processing (8)
  • src/main/java/org/sopt/kareer/domain/member/controller/MemberController.java
  • src/main/java/org/sopt/kareer/domain/member/service/MemberService.java
  • src/main/java/org/sopt/kareer/domain/member/util/PassportOcrParser.java
  • src/main/java/org/sopt/kareer/domain/member/util/VisaOcrParser.java
  • src/main/java/org/sopt/kareer/global/document/exception/DocumentErrorCode.java
  • src/main/java/org/sopt/kareer/global/document/service/DocumentProcessingService.java
  • src/main/java/org/sopt/kareer/global/document/util/DocumentDateUtils.java
  • src/main/java/org/sopt/kareer/global/external/clova/service/ClovaOcrService.java
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/main/java/org/sopt/kareer/global/document/exception/DocumentErrorCode.java
  • src/main/java/org/sopt/kareer/domain/member/util/VisaOcrParser.java
  • src/main/java/org/sopt/kareer/global/document/service/DocumentProcessingService.java

@jeong1112 jeong1112 merged commit 1da641f into develop Mar 26, 2026
2 checks passed
@jeong1112 jeong1112 deleted the feat/#204-ocr branch March 26, 2026 12:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feat] 온보딩 OCR API 구현

1 participant